12 results found.
Written
Corpus,
Language Type:
Trilingual
Languages:
Bosnian Croatian Serbian
Availability:
Freely Available
License:
CC-BY-SA 3.0
Size:
1909886930
Production Status:
Newly created-in progress
Use:
Paper:
N/A
Documentation:
<Not Specified>
Written
Language Identifier,
Language Type:
Trilingual
Languages:
Bosnian Croatian Serbian
Availability:
Freely Available
License:
LGPL
Size:
6
Production Status:
Newly created-finished
Use:
Language Identification
Paper:
N/A
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Trilingual
Languages:
Bosnian Croatian Serbian
Availability:
Freely Available
License:
CC-BY-SA 3.0
Size:
894230686
Production Status:
Newly created-in progress
Use:
Paper:
N/A
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Trilingual
Languages:
Bosnian Croatian Serbian
Availability:
Freely Available
License:
CC-BY-SA 3.0
Size:
428925567
Production Status:
Newly created-in progress
Use:
Paper:
N/A
Documentation:
<Not Specified>
Written
Corpus,
Language Type:
Monolingual
Languages:
Afrikaans Albanian Arabic Armenian Bangla Basque Bosnian Breton Bulgarian Catalan Croatian Czech Danish Dutch English Esperanto Estonian Filipino Finnish French Galician Georgian German Greek Hebrew Hindi Hungarian Icelandic Indonesian Italian Japanese Kazakh Korean Latvian Lithuanian Macedonian Malay Malayalam Norwegian Persian Polish Portuguese Romanian Russian Serbian Sinhala Slovak Slovenian Spanish Swedish Tamil Telugu Thai Turkish Ukrainian Urdu Vietnamese pt_br ze_en ze_zh zh_cn zh_tw
Availability:
Freely Available
License:
<Not Specified>
Size:
22.10G tokens
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: word2word: A Collection of Bilingual Lexicons for 3,564 Language Pairs
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yo Joong Choe | OpenSubtitles2018 | /N |
Documentation:
Yes, on the website.
Written
Lexicon,
Language Type:
Monolingual
Languages:
Afrikaans Albanian Arabic Armenian Bangla Basque Bosnian Breton Bulgarian Catalan Croatian Czech Danish Dutch English Esperanto Estonian Filipino Finnish French Galician Georgian German Greek Hebrew Hindi Hungarian Icelandic Indonesian Italian Japanese Kazakh Korean Latvian Lithuanian Macedonian Malay Malayalam Norwegian Persian Polish Portuguese Romanian Russian Serbian Sinhala Slovak Slovenian Spanish Swedish Tamil Telugu Thai Turkish Ukrainian Urdu Vietnamese pt_br ze_en ze_zh zh_cn zh_tw
Availability:
Freely Available
License:
CreativeCommons Attribution 4.0 International
Size:
41 GByte
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: word2word: A Collection of Bilingual Lexicons for 3,564 Language Pairs
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yo Joong Choe | word2word | /N |
Documentation:
Yes, on the website.
Written
Corpus,
Language Type:
Multilingual
Languages:
Afrikaans Albanian Amharic Arabic Aragonese Armenian Assamese Azerbaijani Basque Belarusian Bengali Bosnian Breton Bulgarian Burmese Catalan Central Khmer Chinese Croatian Czech Danish Dutch Dzongkha English Esperanto Estonian Finnish French Gaelic Galician Georgian German Greek Gujarati Hausa Hebrew Hindi Hungarian Icelandic Igbo Indonesian Irish Italian Japanese Kannada Kazakh Kinyarwanda Korean Kurdish Kyrgyz Latvian Limburgan Lithuanian Macedonian Malagasy Malay Malayalam Maltese Marathi Mongolian Nepali Northern Sami Norwegian Norwegian Bokmål Norwegian Nynorsk Occitan Oriya Panjabi Pashto Persian Polish Portuguese Romanian Russian Serbian Serbo-Croatian Sinhala Slovak Slovenian Spanish Swedish Tajik Tamil Tatar Telugu Thai Turkish Turkmen Uighur Ukrainian Urdu Uzbek Vietnamese Walloon Welsh Western Frisian Xhosa Yiddish Yoruba Zulu
Availability:
Freely Available
License:
Size:
55 million sentences
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation
- Paper track: Long/Machine Translation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Biao Zhang | the open parallel corpus (OPUS) | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
Amharic Bosnian Croatian Dari English French Georgian Haitian Hausa Hindi Korean Mandarin Chinese Persian Portuguese Pushto Russian Spanish Turkish Ukrainian Urdu Vietnamese Yue Chinese
Availability:
From Owner
License:
LDC
Size:
215 hours
Production Status:
Existing-used
Use:
Language Identification
- Paper title: Metric learning loss functions to reduce domain mismatch in the x-vector space for language recognition
- Paper track: 4.1 Language identification and verification, lang/Oral Presentation
- Paper status: Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2009 NIST Language Recognition Evaluation Test Set | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Amharic Bosnian Croatian Dari English French Georgian Haitian Hausa Hindi Korean Mandarin Chinese Persian Portuguese Pushto Russian Spanish Turkish Ukrainian Urdu Vietnamese Yue Chinese
Availability:
From Owner
License:
LDC
Size:
215 hours
Production Status:
Existing-used
Use:
Language Identification
- Paper title: Modeling and training strategies for language recognition systems
- Paper track: 4.1 Language identification and verification, lang/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2009 NIST Language Recognition Evaluation Test Set | /N |
Documentation:
None
Language Type:
Multilingual
Languages:
Bosnian Croatian Indonesian Malay Serbian
Availability:
Freely Available
License:
<Not Specified>
Size:
280000 sentences
Production Status:
Existing-updated
Use:
Language Identification
- Paper title: Discriminating Similar Languages: Evaluations and Explorations
- Paper track: Evaluation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Cyril Goutte | National Research Council Canada | CA |
| Author 2 | Serge Léger | National Research Council Canada | CA |
| Author 3 | Shervin Malmasi | Macquarie University | AU |
| Author 4 | Marcos Zampieri | Saarland University | DE |
| Main Contact | Cyril Goutte | National Research Council Canada | None |
Documentation:
Tan et al. (2014) Proc. BUCC.